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Abstract 

The theory of compressive sensing (CS) has shown us that under certain conditions, a sparse signal can 
be recovered from a small number of linear incoherent measurements. An effective class of reconstruction 
algorithms involve solving a convex optimization program that balances the f i norm of the solution against 
a data fidelity term. Tremendous progress has been made in recent years on algorithms for solving these 
li minimization programs. These algorithms, however, are for the most part static: they focus on finding 
the solution for a fixed set of measurements. In this paper, we will present a suite of dynamic algorithms 
for solving li minimization programs for streaming sets of measurements. We consider cases where the 
underlying signal changes slightly between measurements, and where new measurements of a fixed signal 
are sequentially added to the system. We develop algorithms to quickly update the solution of several 
different types of li optimization problems whenever these changes occur, thus avoiding having to solve 
a new optimization problem from scratch. Our proposed schemes are based on homotopy continuation, 
which breaks down the solution update in a systematic and efficient way into a small number of linear 
steps. Each step consists of a low-rank update and a small number of matrix-vector multiplications - 
very much like recursive least squares. Our investigation also includes dynamic updating schemes for 
li decoding problems, where an arbitrary signal is to be recovered from redundant coded measurements 
which have been corrupted by sparse errors. 

Index Terms 

Homotopy, sparse signal recovery, recursive filtering, compressive sensing, (.^ norm minimization, 
t\ decoding, LASSO, Dantzig selector. 
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I. Introduction 

Recovering a signal from a set of linear measurements is a fundamental problem in signal processing. 
We are given measurements y G M™ of the form 

y = Ax + e, (1) 

where is an m x ra matrix and e is a noise vector. From these, we wish to reconstruct the unknown 
signal X G R". The classical solution to this problem is to estimate x from y using least-squares. Given 
y, we solve 

miiumize \\Ax — (2) 

or when A is ill-conditioned 

minimize — 2/111 + ''"lla^IlL 0) 

where r > is a regularization parameter. Each of these minimizers can be found by solving a system 
of Unear equations. We can interpret the solution to (3) as the estimate which, depending on the value of 
r, strikes a balance between the data fidelity (we want the energy in the mismatch between the simulated 
measurements Ax of our estimate and the true measurements y to be small) and the complexity of the 
estimate (among all estimates with the same measurements, we want the one with minimal energy). 

Recent developments in the theory of compressive sensing (CS) have shovm us that under certain 
conditions, dramatic gains can be had by promoting sparsity instead of miiumizing energy. There are two 
classes of problems: 

CS: In this case, the matrix A is underdetermined, and the signal x is sparse. To promote sparsity in the 
solution, we penalize the £i norm of the estimate, solving 

miiumize ^j^x — + t||x||i. (4) 

For certain types of measurement matrices (namely, matrices that obey a type of uncertainty principle 
[1]), (4) comes with a number of performance guarantees [2]-[7]. In particular, if x is sparse enough and 
there is no noise, (4) will recover x exactly as r — >^ even though A is underdetermined; the recovery can 
also be made stable when the measurements are made in the presence of noise with an appropriate choice 
of r. There are also several variations on (4) which use slightly different penalties for the measurement 
error. We will also be interested in one of these variations, the Dantzig Selector [8] given in (10) below. 
Decoding: In this case, the matrix A is overdetermined, and the error e is sparse. To account for this, 
we solve 

miiumize ||Aa; — (5) 



3 



in place of (2). There are again a number of performance guarantees for (5) that relate the number of 
errors we can correct (number of non-zero entries in e) to the number of measurements we have collected 
(rows in A) [9], [10]. If the matrix consists of independent Gaussian random variables, then the number 
of errors we can correct (and hence recover x exactly) scales with the amount of oversampling m — n. 

These i\ minimization programs are tractable, but solving them is more involved than least-squares. 
In this paper, we will be interested in how solutions to these problems change as 1) the signal we are 
measuring changes by a small amount, and 2) new measurements of the signal are added. We will present 
a suite of algorithms that avoid solving these programs from scratch each time we are given a new set 
of measurements, and instead quickly update the solution. We will constrain our discussion to small and 
medium scale problems, where the matrices are stored explicitly and linear systems of equations are 
solved exactly (within machine precision) using direct methods. We begin with a brief review of how 
updating works in the least-squares scenario. 

A. Update of least-squares 

When the mxn matrix A has full column rank (is overdetermined), the least squares problem (2) has 
a unique solution xq found by solving a system of linear equations: 

xo = {A^A)-'A^y. (6) 

There is a variety of ways to compute xq, including iterative methods that have the potential to return an 
approximate solution at relatively low cost, but in general the computational cost involved for an exact 
solution is O(mn^). Typical direct methods for solving (2) involve Cholesky or QR decompositions [11], 
[12]. If we have already computed the QR factorization for A (or Cholesky factorization for A^A), then 
there is not much marginal cost in recovering additional signals measured with the same matrix A. We 
can simply use the already computed factorization for the new set of measurements at a cost of 0{mn). 

There is also an efficient way to update the solution if we add (or remove) a small number of 
measurements to the system. Assume that we have solved (2) using (6) to get the estimate xq with our 
current set of m measurements y. Now suppose that we get one new measurement given as w = bx + d, 
where 6 is a row vector and d G M denotes noise in the new observation. The system of equations 
becomes 

V A e 

(7) 
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and the least-squares solution xi obeys: {A^ A + b^b)xi = {AJ'y + b^w). A naive way to compute xi 
would be to solve this new system of equations from scratch. But we can avoid this computationally 
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expensive task by using rank-l updates, reducing the cost of computing the new solution from O(mn^) 
to 0{mn). 

The new solution xi can be written in terms of the previous solution xq using the matrix inversion 
lemma (also known as the Sherman-Woodbury-Morrison formula) ^. The matrix {A^A + b'^b)~^ can be 
computed from {A'^A)~^ in the following manner. With Pq = (^^^4)"^, we set 

Pi := {A^A + 6^6)-^ = Po - KibPo, where Ki = Po6^(l + bPob^)''^, (8) 

and the new estimate can be written as 

xi = xo + Ki{w — bxo). (9) 

Note that 1 + bP^b^ in (8) is a scalar, and so the essential cost of the updating procedure is a few 
matrix-vector multiplies. Thus given the new measurement w, we can find the new solution in 0{mn) 
computations. 

The goal of this paper is to develop a similar methodology for updating the solutions to a suite of l\ 
minimization programs. This will not be as straightforward as in the least-squares case, but we will see 
that we can move between solutions using a series of low rank updates similar to (8), (9). 

B. £i problems 

In this section, we give a brief overview of the four types of £i minimization programs for which we 
will develop dynamic updating algorithms. 

A large body of Uterature has arisen around the problem of reconstructing a sparse signal from a 
Umited number of measurements. The essence of this theory, which goes under the name of compressive 
sensing, is that if the mxn matrix A is incoherent, then we can reUably estimate x about as well as if we 
observed its m/logn most significant components directly. The technical conditions for this incoherence 
property basically state that A has to be close to an isometry when it operates on sparse signals [1]. 
There are several manners in which these types of matrices can be generated, the easiest of which is to 
simply draw the entries of A independently from a concentrated (e.g. Gaussian) distribution [1], [5]. 

'in practice, we will want to update the Cholesky or QR factorizations, rather than the explicit inverse of A^A, as the Sherman- 
Woodbury-Morrison formula can become numerically unstable if the matrices are not well-conditioned. Here we discuss the 
update in terms of the expUcit inverses to simplify the exposition, and to separate the main concept — that the solution of a 
system of equations can be efficiently updated — from its implementation. For a detailed discussion of methods for low-rank 
updates, see [12]. 
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We will discuss two optimization programs for sparse signal recovery. The first is (4), which goes by 
the name of basis pursuit denoising (BPDN) [13] in signal processing and is the lagrangian formulation 
of the LASSO [14], a well-known tool for model selection in statistics. Solving (4) is robust in that it 
is stable both in the presence of noise and to the fact that the signal may not be exactly sparse [2]-[4]. 
Methods for computing the solution to BPDN can be found in [13], [15]-[20]. 

Related to, but subtly different than, BPDN is the Dantzig selector (DS) [8]. Instead of requiring that 
the residual Ax — y for a candidate estimate x have small energy, it asks instead that the residual should 
not be too correlated with any of the columns of A. Given the measurements y, the DS solves 

minimize \\x\\i subject to \\A'^{Ax — y)\\oo < t, (10) 

for some relaxation parameter r > 0. For incoherent A, the DS guarantees a near-optimal estimate of a 
sparse signal when the measurements are made in the presence of Gaussian noise. Algorithms for solving 
(10) can be found in [21]-[23]. 

While we can compress a sparse signal by applying an underdetermined incoherent matrix, we can 
also protect a general signal against sparse errors by applying an overdetermined incoherent matrix. If 
we take m = Cn incoherent measurements of an arbitrary signal x, where C > 1, and add a sparse error 
e that has fewer than p{C) -m non-zero terms, where p{C) is a constant that depends on C, then solving 
the optimization program (5) will recover x exactly [9], [10]. This result depends only on the number of 
nonzero terms in e, and not on their magnitude. Another way to interpret the action of ^ is as a channel 
encoding which can correct a certain number of (arbitrarily large) errors. 

This recovery can also be made robust to small errors present throughout all of the measurements [24]. 
Suppose that we measure 

y = Ax + e + qy, (11) 

where ^ is the m x n coding matrix with m > n, e is a sparse error vector (the gross errors), and qy is 
a non-sparse error vector whose entries are relatively small. To account for both types of error, we solve 

minimize T||e||i + ^ ||^jy||2 subject to Ax + e + qy = y, (12) 

which can be rewritten as 

minimize r||e||i -I- ^||Q(e-y)||i, (13) 

where Q is a matrix whose rows span the null space of A^, QA = 0; one particular choice is Q := 

/ — A{A'^ A)'^ A^ . This problem is similar to BPDN, and its solution gives us an estimate e of the error. 
The decoded message can then be found using x = {A'^A)~^A^{y — e). 
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C. Overview 

The goal of this paper is to develop dynamic algorithms for solving these types of ii minimization 
programs. We will characterize how their solutions change when a small number of new measurements 
are added, and (in the case of the BPDN and DS) when the signal changes. In doing this, we will see that 
moving from one solution to the next can be broken down as a series of linear problems which can in turn 
be solved with a series of low-rank updates. Our approach is based on homotopy continuation principle, 
which we describe in Section n. The main idea of the homotopy framework is to slowly change from 
one optimization program to another by varying a (carefully placed) parameter in such a maimer that we 
can trace the solution path. In Section m, we see how to apply this principle to update the solution to 
the BPDN and DS as the signal we are measuring changes. In Sections IV we see how to update the 
solutions to the BPDN and DS when a new measurement is added (the former has been independently 
addressed previously in [25]). Sections V and VI turn to the decoding problem, where we see that there 
are gains to be had by adding measurements in clusters. Section VII contains numerical experiments 
that demonstrate the effectiveness of these algorithms, and compares dynamic updating to state-of-the-art 
£i miiumization algorithms which have been "warm started". MATLAB code for all of the algorithms 
presented in this paper, along with scripts that reproduce the figures, is publicly available [26]. 

II. Homotopy 

Homotopy gives us a continuous transformation from one optimization program to another. The 
solutions to this string of programs lie along a continuous parameterized path. The idea is that while the 
optimization programs may be difficult to solve by themselves, we can trace this path of solutions as we 
slowly vary the parameter. 

A common use for homotopy is to trace the path of solutions as the relaxation parameter changes. In 
this section, we give a brief overview of these methods for BPDN and the DS, as many of the ideas are 
used in our updating algorithms. 

A. Basis pursuit denoising homotopy 

There is an extensively studied [19], [20], [27] homotopy algorithm associated with the BPDN that 
traces the solution to (4) as the parameter r changes. The path is followed by ensuring that certain 
optimality conditions are being maintained. To be a solution to (4), a vector x* must obey the following 
condition [28], [29]: 

P^(^x*-y)||oo<T. (L) 
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X = < 



We can view (L) as a set of n different constraints, one on each entry of the vector of residual correlations 
A^{Ax* — y). In addition, a sufficient condition for the optimahty of x* is that the set of locations for 
which the constraints in (L) are active (i.e. equal to r) will be the same as the support of x* (the set of 
locations for which ,x* is non-zero) [29]. Denoting this set by F, we can write the optimality conditions 
for any given value of r as 
LI. Al{Ax* -y) = -Tz 
hi. \\A^,{Ax* -y)\\oo<T, 

where is the m x |r| matrix formed from the columns of A indexed by F, and z is a |r|-vector 
containing the signs of x* on F. From this we see that x* can be calculated directly from the support F 
and signs z using 

^AfAry^Afy-Tz) on F 

otherwise. 
Thus we can interpret the solution to BPDN as a type of soft-thresholding: given the support F, we first 
project y onto the range of Ar and then we subtract T{A'^Ar)~^z. As we change r, the solution moves 
along a fine with direction {AfAr)~^z until one of two things happens: an element of x* is shrunk 
to zero, removing it from the support of x*, or another constraint in (L) becomes active, adding a new 
element to the support of x*. At these so-called critical points, both the support of x* and the direction 
of the solution path change. Also, at any point on the solution path it is straightforward to calculate how 
much we need to vary r to take us to a critical point in either direction. 

With these facts in hand, we can solve (4) by starting with a very large value of r (i.e., r > ||A^y||oo), 
where the solution is the zero vector, and reduce it to the desired value while hopping from one critical 
point to the next. At each critical point along this path, a single element is either being added to or 
removed from F, and the new direction can be computed from the old using a rank-1 update. Thus 
multiple solutions over a range of r can be calculated at very little marginal cost. 

B. Dantzig selector homotopy 

The homotopy algorithm for the Dantzig selector (DS) is similar in principle to the BPDN homotopy 
[22], [23]. The essential difference in the case of our DS homotopy algorithm is that we have to keep 
track of both the primal and dual solution for (10) as we change r. The dual problem to the DS in (10) 
can be written as 

maximize - (t||A||i + {\A^y)) subject to P^AA||oo < 1, (14) 
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where A G is the dual optimization variable. We can derive the required optimality conditions by 
recognizing that at the solution, the objectives in (10) and (14) will be equal, due to strong duahty 
[30] . This fact, along with the complementary slackness property, means that a primal-dual solution pair 
{x*,X*) to (10) and (14) for any given value of r must satisfy the following optimality conditions [23]: 

DSl. A^^iAx* - y) = Tzx 
DS2. A]:^AX* = -z^ 
DS3. \\A^c{Ax* - y)\\oo < T 
DS4. pf,AA*||oo < 1, 

where and Tx are the supports of x* and A* respectively, Zx and zx are the sign sequences of x* 
and A* on their respective supports. We will call (DS1,DS3) the primal constraints, and (DS2,DS4) the 
dual constraints. From these optimality conditions we can see that the primal and dual solutions can be 
calculated directly using the supports and sign sequences (F^;, T;^, z^, zx). Also we can see that the active 
primal constraints correspond to the support of dual variable and the active dual constraints correspond 
to the support of primal variable. 

With these facts we can develop the homotopy algorithm for DS in a similar way; we start from a large 
value of r (i.e., r > ||v4-^y||oo, where the solution is the zero vector) and reduce r gradually by updating 
the support and sign sequence at every critical point. As we change r, the solution moves along a line 
in the direction —{Af^A-pJ^^zx until one of the two things happens at a new critical point: an element 
in X* shrinks to zero (removing an element from the support of x*) or an inactive primal constraint 
becomes active (adding an element to the support of A*). We call this first phase the primal update. 
This gives us the value of x* at the new critical point but the value of A* is still unknown. So we use 
the information about the change in the support from the primal update phase to find the new value for 
the dual solution A* at this critical point, during which either an existing element in A* shrinks to zero 
(removing an element from the support of A*) or an inactive dual constraint becomes active (adding an 
element to the support of x*). We call this second phase the dual update. For further details on the DS 
homotopy see [23]. 

The homotopy algorithms we discuss below are in many ways similar to the standard BPDN and DS 
homotopy. In each of them we will introduce a homotopy parameter into the optimization program that 
gradually incorporates the new measurements as we vary it from to 1. The path the solution takes 
will again be piecewise linear, and we will jump from critical point to critical point, determining the 
direction to move using modified version of the optimahty conditions L1-L2 and DS1-DS4 above. Each 
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step will be very efficient, requiring only a few matrix- vector multiplications. We start with the problem 
of recovering a time varying sparse signal. 

III. Dynamic update of time varying sparse signal 

In this section we will discuss the problem of estimating a time-varying sparse signal from a series of 
linear measurement vectors. We expect that the signal changes only slightly between measurements, so 
the reconstructions will be closely related. There are many scenarios where this type of problem could 
arise. For example, in real-time magnetic resonance imaging we want to reconstruct a series of closely 
related frames from samples in the frequency domain [31]. Another application is channel equahzation 
in communications, where we are continuously trying to estimate a time varying (and often times sparse) 
channel response [32]. 

Assume that we have solved the BPDN problem (4) for the system in (1) for a given value of r. Now 
say that the underlying signal x changes to x and we get a new set of m measurements given as 

y = Ax + e. (15) 

We are interested in solving the following updated BPDN problem 

minimize r||x||i -|- ^ — y||2, (16) 

for the same value of r. Since we expect that the signal changes only slightly between the measurements, 
the reconstruction will be closely related. Our goal is to avoid solving (16) from scratch, instead using the 
information from the solution of (4) to quickly compute the solution for (16). Similarly we are interested 
in quickly computing the solution of the following updated DS problem 

minimize ||a;||i subject to \\A^ {Ax — y)\\oo < t , (17) 

by using the information from the solution of (10). 

We will develop the homotopy algorithms for updating the solution for (16) and (17) following three 
steps. First, we provide a homotopy formulation for the problem moving from one set of measurements 
to next. Second, we derive the optimality conditions that the solution must obey for each value of 
the homotopy parameter. Finally, we use these optimality conditions to trace the path towards the new 
solution. 
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A. Basis pursuit demising update 

Let us first look at the dynamic update of the solution for the BPDN problem. Our proposed homotopy 
formulation is as follows: 

minimize r||x||i + ^ ^ \\Ax — y\^ + ^11^^ ~ ^llii (18) 

where e is the homotopy parameter. As we increase e from to 1 we move from the solution of the old 
optimization program (4) to the solution of the new one (16). 

By adapting the optimaUty conditions LI and L2 from Section 11, we see that for x* to be a solution 
to (18) at a given values of e we must have 

\\A^{Ax* -{l-e)y-ey)\\oo<T, (19) 

or more precisely, 

Al{Ax* - (1 - e)y - ey) = -rz (19a) 

\\Al.{Ax* -{\-t)y-ey)\\oo<T, (19b) 

where F is the support of x* and z is its sign sequence on F. We can see from (19a) that again the 
solution to (18) follows a piecewise linear path as e varies; the critical points in this path occur when an 
element is either added or removed from the solution x* . 

Suppose that we are at a solution (with support F and signs z) to (18) at some critical value 
of e = efc between zero and one. To find the direction to move, we will examine how the optimality 
conditions behave as e increases by an infinitesimal amount from to e^. The solution at e = 
must obey 

Al{Axl - (1 - eX)y - e+y) = -rz. (21) 
Subtracting (19a) from (21), the difference between the solutions dx = x'^ — Xk will be 

Ae-iA^Arr'Afiy-y) on F 
otherwise, 
where Ae = e^^ — e^. So as e increases from e^, the direction the solution moves is given by 

(A^Arr^Afiy-y) on F 



dx 



dx 



(22) 

otherwise. 



With the direction to move given by (22), we need to find the step-size 6 that will take us to the next 
critical value of e. We increase e from e^, moving the solution away from Xk in the direction dx, until 
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one of the two things happens: one of the entries in the solution shrinks to zero or one of the constraints 
in (19b) becomes active (equal to r). The smallest amount we can move e so that the former is true is 
simply 

e-=rmn(^^) , (23) 
jer \ dx{j) ) ^ 

where min( )+ denotes that the minimum is taken over positive arguments only. For the latter, set 

Pfe = A^{Ax^, -y + €k{y - y)) (24a) 
dk = A^{Adx + y-y). (24b) 

We are now looking for the smallest stepsize Ae so that Pk{j) + Ae • dk{j) = ±t for some j G T^. This 
is given by 

^- = nnn(^^,^±M) 

So the stepsize to the next critical point is 

e = min(^+,e-). (26) 

With the direction dx and stepsize d chosen, the next critical value of e and the solution at that point 
will be 

Cfc+i = ek + 0, Xk+i = Xk + Odx. 

The support for new solution x^+i differs from V by one element. Let 7~ be the index for the minimizer 
in (23) and 7+ be the index for the minimizer in (25). If we chose 6~ in (26), then we remove 7" from 
the support V and the sign sequence z. If we chose 0+ in (26), then we add 7+ to the support, and add 
the corresponding sign to z. 

This procedure is repeated until e = 1. A precise outline of the algorithm is given in Algorithm 1 in 
Appendix A. 

The main computational cost at every homotopy step comes from solving a |r| x |r| system of equations 
to compute the direction in (22), and two matrix- vector multiplications to compute the for the stepsize. 
Since the support changes by a single element from step to step, the update direction can be computed 
using a rank-1 update, as described in Section I-A. As such, the computational cost of each step is 

0{mn). 
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B. Dantzig selector update 

The homotopy algorithm for dynamic update of DS with time varying signals is very similar to the 
BPDN update, with the additional requirement of updating both the primal and dual solutions at every 
homotopy step. Our proposed homotopy formulation is as follows: 

minimize subject to \\A^ {Ax — (1 — e)y — ey)\\oo < t , (27) 

where e is the homotopy parameter. The optimality conditions for any primal-dual solution pair [x* , A*) 
to (27) at a given value of e can be written as 

Al^ {Ax* -{l-e)y-ey) = tzx (28a) 

A^^AX* = -z^ (28b) 

Pfc(Aa;*-(l-e)y-ei/)||oo <r (28c) 

Pfc^Afclloo < 1. (28d) 

a; 

It can be seen from (28a) that the solution x* to (27) follows a piecewise linear path w.r.t. e, and there 
will be some critical points along the homotopy path where the support of x* and/or A* change. 

1) Primal update: Suppose that we are at some critical value of e = e^, with primal-dual solution 
{^ki^k) with support and sign sequence {Vj^^Vx^Zx-^zx). As we change e from to e^, the solution 
changes to = + AeSx, where dx is given as 




otherwise, 



and Ae = — ejfc. If we start to move in the direction dx by increasing e from e^, at some point either 
a primal constraint will be activated in (28c) (indicating addition of a new element to the support of A) 
or an element in x^ will shrink to zero. We select the smallest step size Q, as described in (23), (25) and 
(26), such that one of these two things happens. The new critical value of e will be e/c+i = + ^ and 
the new primal solution will be x^+i = Xk + Odx. 

2) Dual update: As we mentioned in the case of standard DS homotopy, we do not yet have the 
dual solution at this new critical value of e. In the dual update we use the information about the support 
change from the primal update to find the update direction d\ for the dual vector and consequently the 
dual solution A^+i at e = e^+i. Assume that during primal update, a new element entered^ the support 

^If instead an element was removed from support of Xk, we can pick an "artificial" index 7 G and treat it as the new 
element in the support of A with appropriate sign z-y. 
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dX= < 



of A at index 7 with sign Zj. Then using (28b) we can write the update direction as 

-z^iA^^ArJ'^A^^a^ on Tx 
z-y on 7 

otherwise, 
where is the 7th column of A, Zj is the sign of 7th primal active constraint. This direction ensures 
that the dual constraints remain active on and the sign of new non-zero element in = + 6dX 
at index 7 is z-y. As we move our solution Afe in this direction dX by increasing the step size 6 from 0, 
one of two things will happen, either a nonzero element from A^ will shrink to zero or a dual constraint 
in (28d) will become active (indicating addition of a new element in Tx). The smallest step size such 
that an entry in Afe shrinks to zero is simply 

min ( ) . (30) 



The smallest step size such that a constraint in (28d) becomes active is given by 



^" = "Tf'^#''^) ' (31) 
j&n V h{j) -bk{j) J + 



where = A'^AXk and bk = A^AdX. The stepsize for the update of dual solution is 5 = min((5+, 
The new dual solution will be A^+i = A^ + ddX. The primal and dual support is updated accordingly. 
This procedure of primal and dual update is repeated until e = 1. 

IV. Dynamic update with sequential measurements 

In this section we will discuss the homotopy algorithms to update the solutions for BPDN and DS as 
new measurements are added to the system sequentially. Assume that we have solved the BPDN (4) for 
the system in (1) for some given value of r. Then we introduce one new measurement^ w = bx + d as 
described in (7). We now want to solve the following updated problem 

minimize r||a;||i + ^(||74x — + — wp), (32) 

for the same value of r. Similarly for the DS, we want to solve the following updated problem 

minimize ||a;||i subject to \\A^ {Ax — y) + b^ {bx — w)\\oo < t, (33) 

using the information from the solution of (10). 



^We can just as easily remove a measurement by taking e from 1 to in (34) and (42). 



14 



We will use the same three steps discussed in Section m to update the solution; first strategically 
introducing a homotopy parameter, then writing down the appropriate optimality conditions, and finally 
using the optimality conditions to trace a path to the new solution. 

A. Basis pursuit demising update 

Let us first discuss the homotopy algorithm for the dynamic update of sequential measurements. We 
note that a similar version of this algorithm has appeared recently in [25]; we include discussion here 
as it fits nicely into our overall framework, and is closely related to the updating algorithms for the 
time-varying problem in Section m and the robust decoding problem in Section VI. 

We incorporate the new measurement gradually by introducing the parameter e, in the homotopy 
formulation as: 

minimize r||x||i + ^(|| Ax — y||2 + e|6x — -u;!^). (34) 

Again, as e increases from to 1, we will go from the old problem (4) to the new one (32). 

The optimahty conditions LI and L2 from Section n dictate that to be a solution to (34), x* supported 
on r with signs z must obey 

A^{Ax* -y) + eb^{bx* - w) = -tz (35a) 
\\A^.{Ax* -y) + eb^dbx* - w)\\^ < r, (35b) 

Again, we can see the solution follows a piecewise linear path as e varies, and the path changes directions 
at certain critical values of e for which an element is either added or removed from the support of the 
solution. 

Suppose we are at a solution x^ to (34) at one of these critical values of e = e^. Increasing e an 
infinitesimal amount to e^J", we can subtract the optimality condition (35a) at x* = x^ from the condition 
for X* = to get 

-(e+ - ek){AlAY + ep^br)-^b^ibxk - w) on T 
otherwise. 



dx 



where dx = x^ — x^. 

We can simplify this equation using the matrix inversion lemma, separating the step size from the 
update direction. Setting U := A^Ar + ekb^br and u := brU~^b^, we have the following equations for 
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the update direction 

dx 



\J-^h^{hxk -w) on r 

(36) 



otherwise 
As e increases from e^, the solution moves in the direction dx. However, unhke the update in Section in, 
here the amount we move in the direction dx is not proportional to the amount we change e; rather, 
moving from to will move the solution by Q^dx, where 

^ 1 , / + — ^• 

1 + (e^ - <^k)u 

We now need to find the stepsize 0^ that will take us to the next critical point. As we increase e from 
efc (increasing from 0), the solution moves away from x^ in direction dx, until either an existing 
element in Xjt shrinks to zero or one of the constraints in (35b) becomes active. The smallest step-size 
we can take such that an entry shrinks to zero is just 

r=minf^^') , (37) 

To find the smallest step size at which one of the inactive constraints becomes active, first note that as 
we move from to ejj", (35) becomes 

\\A^\A{xk + Qkdx) - y] + e+6^[6(xfe + Q^^dx) - w;]||oo < r. 

Setting 

Pfe = A^{Ax}, -y) + ekb^{bxk - w) (38a) 
4 = {A'^A + ekb^b)dx + b'^{bxk - w), (38b) 

we are looking for the smallest 6k such that Pk{j) + Okdk{j) = ±t for some j G V^. This is given by 

n+ ■ fT-Pk{j) 'r+pk{j)\ 

The stepsize to the next critical point is then 

9 = min{e+,9-), (40) 

and we set 

e.+i = 6, + ^, (41) 

and Xk+i = Xk + Odx. This procedure is repeated until e = 1; pseudocode is given as Algorithm 2 in 
Appendix A. 
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We have to be a little cautious as we are tracking e indirectly through the stepsize 9. In the last step of 
the algorithm, it is possible to choose large enough so that 0/{l — 9u) is extremely large or negative. In 
these situations, we simply reduce the value of 6 until it corresponds to e^+i = 1, marking the endpoint 
of the solution path [33]. 

The main computational cost for each iteration of the algorithm is a rank-1 update for solving a |r| x |r| 
system of equations to find the direction dx, and applications of A and A'^ to find the stepsize. 

B. Dantzig selector update 

The homotopy formulation for (33) is 

minimize subject to {{A^ {Ax — y) + eb^ {bx — w)\\oo < t, (42) 

and the corresponding dual problem is 

maximize - {t\\X\\i + {X, A^y + eb^w)) subject to \\A^ AX + eb^bX\\oo < I, (43) 

where again varying e from to 1 takes us from the old solution to the new one. 

The optimahty conditions for (x*, A*) to be a primal-dual solution pair to (42) and (43) at some fixed 
value of e and r can be written as 

^ {Ax* -y) + ebl^ {bx* - w) = tzx (44a) 

Al^ AX* + ebl^ bX* = -z^ (44b) 

{Ax* -y) + e6fc (6a;* - u;) ||oo < r (44c) 

WAleAX* + e6fc6A*||oo < 1, (44d) 

where and Tx denote the supports of x* and A* respectively, and Zx and zx are the sign sequences 
on their respective supports. 

The procedure to trace the piecewise hnear homotopy path is same as the BPDN update in principle, 
with the additional effort of keeping track of both the primal and dual variables at every homotopy step. 
Assume that we have a solution (x^, A^) at some e = with support and sign sequence (F-r, Vx^Zx^ zxj- 
As we increase e away from to e^, conditions (44a) and (44b) tell us the primal and dual solutions 
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will move according to 

dx= < 



-(e+ - efe)(^f^Ar. + e+b'^^brj-^b'^^{bxk - w) on 

otherwise 



dX 



-(4 - ^^kM^Ar. + 4blM)-'b'^JXk on 
otherwise 
In the exact same manner, as with the BPDN update, the individual step sizes can be separated from the 
update directions using matrix inversion lemma. We can write the solution values at as = x^+Oxdx 
and = \k+0\d\, where Ox and 6\ denote the step sizes and dx and d\ the respective update directions. 
As we increase the step sizes 9x and 6\, e increases and at some point there will be a change in either 
the primal support or the dual support T\. We pick the smallest step size, either 9x or 6\, which 
causes that change, and take primal and dual variables and constraints up to that point. This will give us 
the new critical value of e, the primal or dual solution at that critical point and the change in either Vx 
or V\. Depending on which variable, primal or dual, causes the change in support, we still have some 
room to change the other variable. So using the support update information we will update the other 
variable in a very similar way to the dual update in DS homotopy. For further details, see [26], [34]. 
This procedure is also repeated until e = 1. 

V. ^1 DECODING 

In this section, we will discuss a homotopy algorithm to update the solution to the li decoding problem 
(5) as new measurements are added. We will use the language of a communications system: a transmitter 
is trying to send a message x to a receiver. The message is turned into a codeword by applying A, and 
the received signal y = Ax + e is corrupted by a sparse error vector e. The receiver recovers the message 
by solving (5). If the codeword is long enough (A has enough rows) and the error is sparse enough (not 
too many entries of e are non-zero), the message will be recovered exactly. The receiver will assume 
that the true message has been recovered when the error Ax* — y for the solution to (5) has fewer than 
m — n nonzero terms (in general, the solution will contain exactly m — n terms, and so this degeneracy 
indicates that the receiver has locked on to something special). If the recovered error has exactly m — n 
non-zero terms, the receiver asks the transmitter for more measurements (codeword elements). 

Suppose that the receiver has just solved (5) to get a decoded message, and then p new measurements 



18 



of X are received. The updated system of equations is 



y 




A 




e 






X + 




w 




B 




d 



(45) 



where w represents p new entries in the received codeword, B denotes p new rows in the coding matrix, 
and d is the error vector for the new codeword entries. The receiver now must solve the updated £i 
decoding problem 

minimize ll^a; — y||i + — u)||i. (46) 

These new measurements can be worked into the solution gradually, using the homotopy formulation 

minimize ll^x — y||i + €||i?a; — will. (47) 

As in the Dantzig selector algorithms, we will find it convenient to trace the path of both the primal 
and dual solutions as e increases from to 1. We begin by writing the dual of (47) as 

maximize — X^y — ev^w subject to A^X + eB^v = 0, ||A||oo < 1, ||z^||oo < 1) (48) 

where A G M"* and v eW are the dual optimization variables. 

The optimality conditions for {xk,Xk,i^k) to be a primal/dual solution set at e = can be derived 
as follows. Let := Ax^ — y and d^ := Bx^ — w be the error estimates for the first and second part 
of the codeword; denote their supports by T,. and respectively. Using the fact that the primal and 
dual objectives in (47) and (48) will be equal at their solutions, we get the following conditions for 
(xfc, Afc,i/fc): 

Afc = sign{Axk -y) on Tg, ||Afc||oo < 1 on (49a) 
= sign(5a;fe - w) on T^, llj^feHoo < 1 on (49b) 
A^Xk + ekB'^Uk = 0. (49c) 

The algorithm for tracking the solution to (47), (48) as e moves from to 1 consists of an initialization 
procedure followed by alternating updates of the primal and dual solution. The critical points along the 
homotopy path correspond to the values of e when an element enters or leaves the support of the estimate 
of the sparse error vector [e^ d^^^ . We describe each of these stages below. 
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Initialization: We will use xq, Xq to denote the old primal and dual solutions at e = 0; the old error 
estimate for the first m codeword elements is eo := Axq — y. We initialize the error estimate for the next 
p elements as do := Bxq — w. In general, if we have not yet recovered the underlying message, all of 
the terms in cIq will be non-zero. Throughout the algorithm, we will use T as the index set for the error 
locations over all m+p codeword elements; we initialize it with F = {r,. UF^}, where is the support 
of Co, and is the support of do- The dual variable corresponding to these new measurements will 
start out as uq = sign(Ba;o — w). Apart from keeping track of the support of the current error estimate, 
we will also find it necessary to keep track of which elements from the second part of the error d have 
left the support at some time. To this end, we initialize a set r„ = F^, and when an element of d shrinks 
to zero, we remove it from F„ (we will never grow F„). 

Every step of the homotopy algorithm for £i decoding can be divided into two main parts: primal and 
dual update. Assume that we already have primal-dual solutions {xk,Xk,J^k) for the problems in (47) 
and (48) at e = e^, with supports F (corresponding to all non zero entries in the error estimates) and 
F„ (corresponding to entries of d which remained non-zero throughout the homotopy path so far). Let 
Cfe := Axk — y and dk ■= Bxk — w be the current error estimates. 

1) Dual update: Assuming that the current error estimate has exactly n terms which are zero (so F 
has size m + p — n and F*^ has size n), exactly n entries in the dual vector (A^, v^) will have magnitude 
less than 1. Thus, there are n degrees of freedom for which the dual solution can move during one step 
of the update; we will exercise this freedom by manipulating the dual coefficients on the set F'^. 

If we combine both parts of the coding matrix together as G := B^] and both parts of the dual 
vector together as '■= [A^ ^D^' optimality condition (49c) becomes 



Increasing e from to e^, this condition for the new dual solution = + can be written as 



where is supported only on the set F'^. Since F„ c F and F"^ c T^, using (51), we can write the 



Croiairc +efeGr„[6]r„ =0. 



(50) 



Gr^9^r^ + (e+ - €fc)Grja + ^]r. = 0, 



(51) 



update direction and the step size 6'^ required to change e from to as 



-(Grc)-iGrJ6]r„ on F^ 



(52) 







otherwise 
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= 4- ^k. 



As we increase e from e^, moving the solution in the direction 9^, there will be a point at which an 
element of = (k + ^t^^ ^^^^ become active (equal to +1 or -1) on V^. The smallest step size for this 
to happen can be computed as 



mm 



(53) 



The new values for e and dual vector ^ are given as 

€k+i = ek + e+, ^k+i = ^k + 0^d^- 

Let 7"*" be the index for the miiumizer in (53). This tells us that we have a new element in the estimated 
error vector at index 7+ with sign Zj, same as ^fe+i(7"''). 

2) Primal update: The dual update provides us with a new element in the support of the error estimate. 
As the error estimate will have exactly n entries which are zero until we have recovered the message, 
we know that one of elements currently in F must shrink to zero. This is accomplished by the primal 
update. 

We have the following system of equations at e = 

(54) 

where the old error estimate is supported only on the set F. The dual update has indicated that our 
new error estimate will have a new active term at index 7+, and that the sign of this new term will be 
Zj. Thus we need to update our estimate of the message x such that the new error estimate Ck+i has 
sign [cfe_|_ 1(7+)] = z^ and Ck+i is zero at all other indices in F'^. In other words, an update direction dx 
will satisfy 

[G^{xk + dx)- s]rc = [cfe + e^dc]rc, (55) 
where dc is constrained on the set F"^ as 



A 




y 




e-k 




Xk - 








B 




w 




dk 



del 



(56) 



Zj on 7"'" 
on F^\{7+} 

We will choose above as the smallest value which shrinks an existing element in Ck to zero; it will 
also be the unknown value for the new element in Ck+i at index 7+. 
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Using (55) and (56) we can write the following system of equations to compute the update direction 

dx 



on 7+ 

(57) 

on r'=\{7+} 



where [G ] ^c] corresponds to the rows of G 



indexed by elements in the set T'^. We solve (57) 



A 
B 

to find dx and consequently dc = Cf^dx. The step size associated with dx is 9'^, and as we increase 
0^ from 0, one of the elements in = Ck + 9^dc will eventually shrink to zero. The value of this step 
size can be found with 

jer V dc{j) ) ^ 

which also gives the new value of 0^+1(7"*"). Let us denote 7" as the index corresponding to Q~ . The 
new estimates for the message x and error vector c are given as 

a^ik+i = a;jt + Q~ dx c^+i = + d~dc. 

The support set can be updated as F = [F U 7"'"]\{7~}. If at some point during primal update, an 
element from within F„ is removed, set F„ = F„\{7'~} and ^^+1(7") = £^+1^^+1(7'"). Repeat this 
alternation of the dual and primal updates until e becomes equal to 1. 

The procedure outlined above used two working assumptions. The first is that the error estimate will 
have exactly n zero entries until we recover the original message x. The second is that any n x n 
submatrix formed by picking n rows from the m + p x n coding matrix will be nonsingular. The second 
assumption allows us to calculate the update directions for both the primal and dual; the first ensures 
that this update direction is unique. Both of these assumptions are true with probability 1 if the coding 
matrix is Gaussian or a random projection, and they are true with very high probabihty if the coding 
matrix A is Bernoulli [35]. In addition to this, the condition number of these submatrices will he fairly 
controlled [36]. The algorithm can be extended to properly handle situations where these assumptions 
do not hold, but we will not discuss this here. 

As before, the main computational cost in this algorithm comes from one matrix-vector product to 
compute dc and rank-1 update for solution of a |r| x |r| system to find the update directions d^ and dx. 

VI. Robust £1 decoding 
In practice, we would like a decoding scheme that can handle codewords which have been corrupted 
both by a small number of gross errors and a small amount of ambient noise. In [24], an optimization 
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program similar to (12) (or (13)) was proposed for accomplishing this type of robust error correction. In 
this section we will discuss the updating procedure for these problems as new elements of the codeword 
are received. 

Assume that we have solved (13) for the system in (11) and then we receive p new measurements: 
w = Bx + d + Qw, where B denotes p new rows in the coding matrix, d denotes the sparse errors and 
Qw denotes small noise. The updated system is 



y 




A 




e 












X + 




+ 


w 




B 




d 







the new decoding program becomes 

minimize rdleHi + ||d||i) + ^(||^2/||| + ||g„,|||) subject to Ax + e + qy = y 

Bx + d + Qw = w. 

The homotopy formulation (with parameter e) to work in the new measurement is 

minimize r(||e||i + €||(i||i) + ^(llgj/lll + ll^^lll) subject to Ax + e + qy = y 



(59) 



(60) 



(61) 



Bx + d + Qw = w. 



Similar to (13) we can form a BPDN type equivalent problem to (61): 
minimize r(||e||i + e||J||i) + ^ 















e 




y 








) 




d 




w 





(62) 



where P is the matrix whose rows span the null space of F^, i.e., PF = 0, where F := 

Note that while the decoding problem (62) has the same form as the BPDN, the homotopy formulation 
(61) is significantly different than those in Sections HI and IV. The difference is due to the fact that here 
the size of the sparse entity we wish to estimate (the error) grows with the number of measurements. 

In order to build the homotopy path, we need the optimality conditions for the solution to (62). The 
necessary condition for a pair (e^, d^) to be a solution to (62) at e = is 



P^p(^ 






y 


) 




T 




dk 




w 






efeT 



where ^ denotes the componentwise inequaUty; the last inequalities, involving e^, correspond to the 
non-zero elements in dj^. We collect both parts of the error estimate together as := [e^ d^.]"^ and both 
parts of the measurements as s := [y^ viF]^ . The support of Cfc is given as T := [Tg U r„], where r„ 
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is the index set corresponding to those elements of dk which remain non-zero in Ck and Fg is the index 
set for the remaining non-zero entries in c^. Let Ze and z^, be the sign sequence of on Fg and F„ 
respectively. The optimaUty conditions can now be written as 



Pr,P{ck - S) = -TZe 

PY^P{ck - s) = -ekTZd 



PlP{Ck - s) 



< T. 



(63a) 
(63b) 
(63c) 



We find the update direction by examining these optimality conditions as we increase e a small ways 
from efe. The solution at e = must obey 



and so 



P^P{4 - Ck) 



-TZe 

-e^TZd 



Since and Ck are both supported on the set F, we can write the update direction dc = — Ck and 
associated step size 6k which moves e from to et as 



dc 



on F 



otherwise 



(64) 



Finally, we need to find the stepsize 6 that will take us to the next critical value of e. As we increase 
e from Ck, the solution moves in the direction dc until either an element in cjt shrinks to zero or one 
of the constraints in (63c) become actives (equal to r). The smallest amount we can move e so that an 
element in shrinks to zero is 



min 



-Cfc(j) 
dcij) 



+ 



For the smallest step size that activates a constraint, set 

Pk = P^P{ck - s) 



dk = P'^Pdc, 



(65) 

(66a) 
(66b) 
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and find the smallest 0+ so that Pk{j) + d'^dk{j) = ±r for some j G F'^. In other words, 

^+ = min(^^,^±M)) • (67) 

jer= V 40) -40) y + 

The stepsize to the next critical point is then 

e = mm{9+,9-). (68) 

With the direction dc and stepsize 9 calculated, the next critical value of e is 

9 

Cfc+l — Cfe + 

r 

and the solution (error estimate) at e^+i is 

Cfc+i = Cfe + 6'(9c, 

with one element either entering or leaving the support. 

Repeat this procedure until e becomes equal to 1. If at any point an element of dk from r„ shrinks 
to zero, we remove it from r„ and treat it as if it were an element of (i.e., without homotopy). If 
all the elements in shrink to zero, we will be able to quit. Pseudocode for this procedure is given as 
Algorithm 4 in Appendix A. The final solution c can be used to find the decoded message x using 

x = {F^F)-'F''{s-c). 

The main computational cost involves computing the kernel matrix P in the start and solve (64) for dc 
at each homotopy step. Computing matrix P will cost 0{mn'^) for the first step, and afterwards with each 
new measurement computing any such matrix P will take only a few rank one updates. Since only one 
element changes in T at every homotopy step, the update direction dc can also be computed efficiently 
using few rank one update. 

Our discussion above assumes the invertibility of P^Pr- Recall that P^ is the matrix whose columns 
span the left null space of F, (e.g., P = I — F{F'^F)~^F'^). For P^Pr to be singular requires that 
a vector with sparsity strictly less than m — n + p be in the null space of P. This will not be true for 
generic coding matrices F: if we chose F to be a random projection or iid Gaussian matrix, P^Pr will 
be invertible for all F with |F| < m — n + p with probabihty one. 

VII. Numerical EXAMPLES 

In this section we will discuss some simulation results which demonstrate the efficiency of our proposed 
dynamic update. A MATLAB implementation of each of the algorithms discussed in the paper, along 
with the experiments presented below, is available online at [26]. 
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A. Time varying sparse signals 

We will first look at the update algorithm presented in Section III for reconstructing a series of sparse 
signals. The algorithm is most effective when the support of the solution does not change too much from 
instance to instance. 

In the examples below, we start with a sparse signal a; G M" and its m measurements according to 
the model in (I). We first solve (4) for a given value of r. Then the signal is perturbed slightly to x , a 
new set of m measurements y = Ax + e are taken, and (16) is solved using Algorithm I. In all of the 
examples below, we have used an m x n Gaussian matrix as our measurement matrix A, with all entries 
independently distributed Normal(0, 1/m). 

To gauge how the difference in support will effect the speed of the update, we start with a synthetic 
example. In this first simulation, we start with a sparse signal x which contains ±1 spikes at randomly 
chosen K locations. The measurement vector y is generated as in (1), with e as a Gaussian noise whose 
entries are distributed Normal(0, 0.01^). We solve (4) for a given value of r. Then we modify the sparse 
signal X to get x as follows. First, we perturb the non-zero entries of x by adding random numbers 
distributed Normal(0,0.1^). Then Kn new entries are added to x, with the locations chosen uniformly 
at random, and the values distributed Normal (0, 1). New measurements y := Ax + e are generated, with 
another realization of the noise vector e, and (16) is solved using the DynamicX algorithm (Algorithm I). 

The results of 500 simulations with n = 1024, m = 512, K = m/5 are summarized in Table I. In each 
simulation, Kn was selected uniformly from [0, i^/20]. Several values of r were tested, r = A||A^y||oo 
with A G {0.5, 0.1, 0.05, 0.01}. The experiments were run on a standard desktop PC, and two numbers 
were recorded: the average number of times we needed to apply'* A^ and A (nProdAtA), and the average 
CPU time needed to complete the experiment (CPU). 

Table I also compares DynamicX to three other methods. The first is "Standard BPDN homotopy", 
which resolves (16) from scratch using our own implementation of the homotopy algorithm reviewed in 
Section n (starting r large and gradually reducing it to its desired value). The second is the GPSR-BB 
algorithm [16], which is "warm started" by using the previously recovered signal as the starting point. 
The third algorithm is FPC_AS [37], which is also warm started. The accuracy in GPSR and FPC was 
chosen so that the relative error between the exact solution and their solution was 10-^ We see that 

''Each iteration of the DynamicX algorithm requires an appUcation of A^A along with several much smaller matrix-vector 
multiplies to perform the rank-1 update. Since these smaller matrix- vector multiplies are so much cheaper, the numbers in the 
table include only applications of the full A^A. 



26 



Piecewise Polynomial signal, n = 2048 




200 400 600 800 1000 1200 1400 1600 1800 2000 



Fig. 1. An example of Piecewise smooth signal, sparse in wavelet domain 

DynamicX compares favorably across a large range of r. 

A few comments about Table I are in order. First, the DynamicX solves (16) to within machine 
precision, while both GPSR and FPC are iterative algorithms providing approximate solutions; we 
accounted for this fact by having a rather stringent accuracy requirement. This level of accuracy is 
important for signals which have high dynamic range (some elements of x are much bigger than others). 
However, there are many situations in which less accurate solutions will suffice, and the number of matrix 
products required for GPSR and FPC will be reduced. Second, we feel that the number of applications 
of A^A is a more telling number than the CPU time, as the latter can be affected significantly by the 
implementation. 

Table I also contains results for three other experiments with the following descriptions. 
Blocks: In this experiment, we recover a series of 200 piecewise constant signals of length n = 2048, 
similar to the Blocks signal from WaveLab [38]. We use the Haar wavelet transform to represent the 
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Wavelet coefficients (zoom in) Slices of the image 
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Fig. 2. An image of house (256x256): lower images represent two consecutive slices from this image (on the right) and their 
respective wavelet transform coefficients (on the left) 

signal, and take m = 1024 measurements. Each signal is a slight variation of the last: the discontinuities 
stay fixed, while the levels of the constant regions are perturbed by multiplying by a random number 
uniformly distributed between 0.8 and 1.2. As the signal varies, the signs and locations of the significant 
wavelet coefficients vary as well. 

Piecewise polynomial: This experiment is similar to the Blocks experiment, except that we use a 
piecewise polynomial (cubic) signal and represent it using the Daubechies 8 wavelet transform. A typical 
signal and its wavelet transform are shown in Figure 1. The polynomial functions are perturbed from 
signal to signal by adding small Gaussian random variables to the polynomial coefficients. 
Slices of the House image: In this experiment, we take the 256 column slices of the House image, 
shown in Figure 2, as our sequence of signals, and use the Haar wavelet transform to represent them. As 
the singularities will move slightly from slice to slice, more of the support in the wavelet domain will 
change, making this a more challenging data set than the previous examples. 
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TABLE I 

Comparison of the dynamic update of time-varying sparse signals using the standard BPDN homotopy, 
GPSR AND FPC. Results are given in terms of the number of products with and A, and CPU time. 



Signal type 


A 

(r = Ap^2/||oc) 


DynamicX 
(nProdAtA, CPU) 


Standard Homotopy 
(nProdAtA, CPU) 


GPSR-BB 
(nProdAtA, CPU) 


FPC_AS 
(nProdAtA, CPU) 


n = 1024, 
m = 512, 
K = m/5, 
values: ±1 spikes 


0.5 


(11.84, 0.031) 


(42.05, 0.10) 


(15.34, 0.03) 


(31.29, 0.055) 


0.1 


(12.9, 0.055) 


(154.5, 0.491) 


(54.45, 0.095) 


(103.38, 0.13) 


0.05 


(14.56, 0.062) 


(162, 0.517) 


(58.17, 0.10) 


(102.37, 0.14) 


0.01 


(23.72, 0.132) 


(235, 0.924) 


(104.5, 0.18) 


(148.65, 0.177) 


Blocks 


0.01 


(2.7,0.028) 


(76.8,0.490) 


(17,0.133) 


(53.5,0.196) 


Pew. Poly. 


0.01 


(13.83,0.151) 


(150.2,1.096) 


(26.05, 0.212) 


(66.89, 0.250) 


House slices 


0.005 


(44.69, 0.022) 


(76.85,0.03) 


(220.49, 0.03) 


(148.96, 0.055) 



B. Sequential measurements 

In this experiment our underlying signal x contains ±1 spikes at K randomly chosen locations. The 
mxn measurement matrix A is Gaussian with entries distributed Normal(0, 1/m). We observe y = Ax+e 
with the entries of e iid Gaussian with zero mean and variance 10~^. We start by solving (4) for a given 
value of r. We add one new measurement w = bx+d, where 6 is a row vector whose entries are distributed 
as those in A and d is the additional noise term, and update the solution using the DynamicSeq algorithm 
(Algorithm 2). The results are summarized in Table 11, and are compared as before against the standard 
BPDN homotopy algorithm, GPSR with a warm start, and FPC with a warm start. 

The average number of homotopy iterations taken for the update varies with the sparsity of the solution. 
At large values of r, the solution has a small number of non-zero entries and the update requires something 
like 2 or 3 homotopy steps. For smaller values of r, the solution has many more non-zero terms and the 
number of iterations in the update increases; for example, at r = 0.01||A-^y||oo an average 8 homotopy 
steps were required to incorporate a new measurement. 

C. Robust £i decoding 

Now we will look at an example for the robust error correction update algorithm from Section VI. 
We start with an arbitrary signal x G with n = 150; we generate x by drawing its entries from a 

standard normal distribution. The initial coding matrix A is generated by drawing an rn x n Gaussian 
matrix and orthogonalizing the columns, where m = 300. The sparse error e is added to the codeword 
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TABLE II 

Comparison of Dynamic BPDN update, GPSR and FPC with one new measurement 



Signal type 


A 

(r = Ap^2/|U) 


DynamicSeq 
(nProdAtA, CPU) 


Standard Homotopy 
(nProdAtA, CPU) 


GPSR-BB 
(nProdAtA, CPU) 


FPC_AS 
(nProdAtA, CPU) 


n = 1024, 

m = 512 
K = m/5, 
values: ±1 spikes 


0.5 


(2.43, 0.007) 


(42.1, 0.10) 


(12.21, 0.02) 


(23.84, 0.032) 


0.1 


(4.27, 0.019) 


(151.6, 0.491) 


(40.28, 0.07) 


(104.84, 0.11) 


0.05 


(5.57, 0.024) 


(161.6, 0.537) 


(42.3, 0.072) 


(119.2, 0.12) 


0.01 


(8.3, 0.05) 


(231, 0.929) 


(56.6, 0.095) 


(141.4, 0.145) 



Ax by selecting if = 60 random locations in Ax and setting those values to zero. The small noise qy is 
added to all locations of the codeword; its entries are distributed Normal(0,0.01^). The program (13) is 
solved for r = 0.01 with Q = I- A{A'^ A)-'^ A'^ , giving us an initial solution. We add p new elements 
to the corrupted codeword, deciding whether or not to corrupt any new observation (set it to zero) by 
drawing an independent Bernoulli random variable that has a 10% probability of success. The solution 
is then updated using Algorithm 4. 

TABLE m 

Average number of homotopy steps and CPU time taken to update the robust ii decoding solution. The 

"COLD start" columns CALCULATE THE NEW SOLUTION FROM SCRATCH USING THE STANDARD BPDN HOMOTOPY 
ALGORITHM, WHILE THE "WARM START" COLUMNS USE ALGORITHM 4 TO UPDATE THE SOLUTION. 



New entries 
(P) 


Time per iteration (in sec.) 


Homotopy steps per iteration 


cold start 


warm start 


cold start 


warm start 


1 


0.275 


0.041 


180.33 


16.44 


2 


0.325 


0.078 


182.27 


26.29 


5 


0.292 


0.109 


175.27 


40.05 


10 


0.255 


0.144 


176.15 


58.64 



Table m compares the average number of homotopy steps and CPU time for the update for p G 
{1,2,5,10}. Note that the average number of steps scales favorably with p: adding 10 measurements 
at once requires 58.64 iterations to update the solution (an average of 5.86 per entry), while adding 1 
measurement at a time requires 16.44 iterations on average. Likewise, the average time per entry when 
p = 10 is 0.144/10 = 0.0144 seconds, as compared to 0.041 for p = I. These numbers suggest that it 
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is advantageous to add the measurements in blocks rather than one at a time. 

VIII. Conclusions 

We have presented a suite of homotopy algorithms to quickly update the solution to a variety of 
ii minimization programs. The updates can occur when either new measurements are added to the 
system or the signal we are observing changes slightly. The homotopy methods discussed are simple and 
inexpensive, and promise significantly lower marginal cost than re-solving an entirely new optimization 
program. These methods break the update down into a series of Unear steps. The computational cost of 
each step is a few matrix-vector multiplications, and simulation results show that for reasonably sparse 
signals, only a small number of steps are required for the update. These algorithms are extremely efficient 
in cases where support of the solution does not change much. The numerical results further show that 
for dynamic update, homotopy methods are superior to warm started GPSR and FPC methods. 
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Appendix A 
Pseudo-codes 

Algorithm 1 Dynamic update of time varying sparse signal: DynamicX_BPDN 

Start with eo = at solution xq to (4) with support T and sign sequence z on the F for A; = 0. 

repeat 

compute dx as in (22) 

compute pk, dk as in (24) and 9 as in (26) 

Xfc+i = Xk + Odx 

Cfc+l = Cfc + ^' 

if ek+i > 1 then 

e = i-ek 

Xk+1 = Xk + Odx 

Cfc+l = 1 

break; {Quit without any further update} 

end if 

if 6* = 6*- then 

r-r\{7-} 

else 

r^ru{7+} 

end if 

until stopping criterion is satisfied 
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Algorithm 2 Dynamic update with sequential measurements: DynamicSeq_BPDN 

Start with eo = at solution xq to (4) with support T and sign sequence z on the T for k = 0. 

repeat 

compute dx as in (36) 

compute Pk, dk as in (38) and 9 as in (40) 

Xk+i = Xk + 9dx 
ek+i = ek + 

if efc+i > 1 then 

1 - ffc 



l + {l-ek)u 
Xk+i = Xk + 9dx 

break; {Quit without any further update} 

end if 

if 6' = 6*- then 

r-r\{7-} 

else 

r ^ r u {7+} 

end if 

k*-k+l 

until stopping criterion is satisfied 
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Algorithm 3 ii Decoding Homotopy 

Start at eo = with primal-dual solution xq, Xq, error estimate eo := Axo — y with support Fg. Set 

r„ as the set of indices corresponding to the p new measurements, set do := Bxq — w, and uq := za- 

Set r = [Fe U r„], Co := 

repeat 

Dual update: 

compute as in (52) 

find 0+, 7+ and z-y as described in (53) 

a+i = 6 + 

Cfc+l = (■k + 

if e^+i > 1 then 

= Cfe + 

break; {Quit without any further update} 

end if 

Primal update: 

compute dx from (57), set dc := G^dx 
find 6^ and 7^ as described in (58) 
Xk+i = Xk + O'dx 
Cfc+i = Cfe + 9~dc 

r^[ru7+]\{7-} 

if 7~ e r„ then 

r„ <— r„\{7~} {Treat the corresponding error location without homotopy} 

Cfe+i(7~) = efc+iCifc+i(7~) 

if Tn becomes empty then 

break; {Lucky breakdown} 

end if 
end if 
k^k+1 

until stopping criterion is satisfied 



eo 




Ac 






do 







and G := [A^ B^]. 
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Algorithm 4 Robust £i decoding Homotopy 




compute dc as in (64) 



compute pk, dk as in (66) and as in (68) 



Cfc+i = Cfe + edc 



e 



efe+i = Cifc + - 

T 

if efe+i > 1 then 

e = i-€kT 

Cfc+i = Cfc + 6*90 

Cfc+l = 1 

break; {Quit without any further update} 

end if 

if 6* = 6*- then 




{Treat the corresponding error location without homotopy} 



break; 



{Lucky breakdown} 



end if 



end if 



else 



r ^ r u {7+} 



end if 




{Decoded dataword} 



